Nonparametric learning rules from bandit experiments: The eyes have it!

Authors

  • Yingyao Hu
  • Yutaka Kayaba
  • Matthew Shum
Abstract

Article history: Received 22 February 2012; available online 30 May 2013. JEL classification: D83, C91, C14.

Similar resources

Kernel Estimation and Model Combination in A Bandit Problem with Covariates

The multi-armed bandit problem is an important optimization game that requires an exploration-exploitation tradeoff to achieve optimal total reward. Motivated by industrial applications such as online advertising and clinical research, we consider a setting where the rewards of bandit machines are associated with covariates, and the accurate estimation of the corresponding mean reward functions pl...

Stochastic Game under Unknown Environment: A Strategy for Nonparametric Lob-Pass Problem

We treat an on-line learning model called the lob-pass problem, which is an extension of the bandit problem. The nonparametric case is considered, and a class of strategies that can obtain O(t^ε) cumulative regret for arbitrary ε > 0 is constructed. It is also shown that no strategy can achieve O(log t).

Reinforcement learning and evolutionary algorithms for non-stationary multi-armed bandit problems

Multi-armed bandit tasks have been extensively used to model the problem of balancing exploitation and exploration. A most challenging variant of the MABP is the non-stationary bandit problem where the agent is faced with the increased complexity of detecting changes in its environment. In this paper we examine a non-stationary, discrete-time, finite horizon bandit problem with a finite number ...

A Q-learning Based Continuous Tuning of Fuzzy Wall Tracking

A simple, easy-to-implement algorithm is proposed to address the wall-tracking task of an autonomous robot. The robot should navigate in unknown environments, find the nearest wall, and track it solely based on locally sensed data. The proposed method benefits from coupling fuzzy logic and Q-learning to meet the requirements of autonomous navigation. Fuzzy if-then rules provide a reliable decision maki...

Cognitive Capacity and Choice under Uncertainty: Human Experiments of Two-armed Bandit Problems

The two-armed bandit problem, or more generally, the multi-armed bandit problem, has been identified as the underlying problem of many practical circumstances that involve making a series of choices among uncertain alternatives. Problems like job searching, customer switching, and even the adoption of fundamental or technical trading strategies of traders in financial markets can be formulate...

Journal title:
  • Games and Economic Behavior

Volume 81

Publication date: 2013